Data Compression Using Matching Pursuits

The present invention relates to data compression techniques, and in particular to
techniques which use the matching pursuits algorithm. The invention is
particularly although not exclusively applicable to the field of video and still image
compression.






The transform known as matching pursuits was introduced by Mallat & Zhang, in
their paper "Matching Pursuits with Time-Frequency Dictionaries", IEEE
Transactions on Signal Processing, volume 41, number 12, December 1993. A
significant amount of research has been carried out subsequent to publication of
that paper with a view to applying the matching pursuits algorithm to the
compression of video images and audio data, as exemplified by Neff & Zakhor,
"Very low bit rate video coding based on Matching Pursuits", IEEE Transactions
on Circuits and Systems for Video Technology, volume 7, number 5, October 1997,
pages 158-171; see also their US patent US-A-5699121. While the transform has
proved extremely effective, its practical application has been limited,
primarily because it requires a large amount of computation and is therefore
relatively slow. Conventional thinking is that the transform is unlikely to be
usable in practical real-time video coding systems for some years to come, until
the transform has been sufficiently optimised or hardware speeds sufficiently
increased. One approach, however, that has suggested using matching pursuits in
the context of the encoding of speech is that described by Rezaiifer and
Jaferkhani in their paper "Wavelet Based Speech Coding Using Orthogonal Matching
Pursuit", Proc. 29th Int. Conf. on Information Systems (CISS-95), pp. 88-92,
March 1995.



The invention also dispels the controversial thinking that matching pursuits must
always be computationally intensive.

According to one aspect of the invention there is provided a method of data
compression comprising applying a transform to multi-dimensional data to
generate a multi-dimensional transform data set, and coding the transform data set
by applying one or more one-dimensional matching pursuits algorithms.

Preferably, a plurality of one-dimensional matching pursuits algorithms are used,
each in a different scanning direction through the data. The scan directions may
(but need not) be orthogonal. There may be a single one-dimensional matching
pursuits algorithm per dimension of the data, or there may be fewer: in other words
we may use one or more matching pursuits algorithms up to the number of
dimensions of the data.

According to another aspect of the invention there is provided a method of data
compression comprising:

(a) applying a transform to multi-dimensional data to generate a
multi-dimensional transform data set;

(b) convolving the transform data set with each of a plurality of first
one-dimensional basis functions to generate a corresponding plurality of
convolved data sets;

(c) determining a location in a first direction across all the convolved
data sets, and a first basis function, representative of a greatest
magnitude;

(d) convolving the transform data at the said location with each of a
plurality of second one-dimensional basis functions;

(e) determining a second basis function representative of a greatest
magnitude;

(f) representing part of the transform data surrounding the said location
with an atom derived from the first and second basis functions
corresponding to the greatest determined magnitudes;

(g) subtracting the atom from the transform data set to create a new data
set;

(h) repeatedly updating the convolved data sets by convolving any
changed part of the transform data set with each of the plurality of
first one-dimensional basis functions, and then re-applying steps (c)
to (f); and

(i) outputting as quantized transform data coded versions of the atoms
derived at step (f).

According to another aspect there is provided a method of data compression
comprising:

(a) applying a transform to multi-dimensional data to generate a
multi-dimensional transform data set;

(b) convolving the transform data set with each of a plurality of first
one-dimensional basis functions to generate a corresponding plurality of
convolved data sets;

(c) determining a first location in a first direction across all the
convolved data sets, and a first basis function representative of a
greatest magnitude; and representing part of the transform data
surrounding the first location with a first atom derived from the first
function corresponding to the greatest determined magnitude;

(d) subtracting the first atom from the transform data set to create a new
data set;

(e) convolving the new data set with each of a plurality of second
one-dimensional basis functions;

(f) determining a second location in a second direction across all the
convolved data sets, and a second basis function representative of a
greatest magnitude; and representing part of the new data set
surrounding a second location with a second atom derived from the
second function corresponding to the greatest determined magnitude;

(g) subtracting the second atom from the new data set to create a further
new data set;

(h) repeating step (b) with the further new data set, and then re-applying
steps (c) to (f); and

(i) outputting as quantized transform data coded versions of the atoms
derived at steps (c) and (f).
According to another aspect there is provided a coder for data compression
comprising means for applying a transform to time-varying data to generate a
multi-dimensional transform data set, and a coder for coding the transform data
set by applying a plurality of one-dimensional matching pursuits algorithms,
one for each dimension.



According to another aspect there is provided a coder for data compression
comprising:

(a) means for applying a transform to multi-dimensional data to
generate a multi-dimensional transform data set;

(b) means for convolving the transform data set with each of a
plurality of first one-dimensional basis functions to generate a
corresponding plurality of convolved data sets;

(c) means for determining a location in a first direction across all
the convolved data sets, and a first basis function
representative of a greatest magnitude;

(d) means for convolving the transform data at the said location
with each of a plurality of second one-dimensional basis
functions;

(e) means for determining a second basis function representative
of a greatest magnitude;

(f) means for representing part of the transform data surrounding
the said location with an atom derived from the first and
second basis functions corresponding to the greatest
determined magnitudes;

(g) means for subtracting the atom from the transform data set to
create a new data set;

(h) means for repeatedly updating the convolved data sets by
convolving any changed part of the transform data set with
each of the plurality of first one-dimensional basis functions;
and

(i) means for outputting as quantized transform data coded
versions of the derived atoms.


According to another aspect there is provided a coder for data compression
comprising:

(a) means for applying a transform to multi-dimensional data to generate
a multi-dimensional transform data set;

(b) means for convolving the transform data set with each of a plurality
of first one-dimensional basis functions to generate a corresponding
plurality of convolved data sets;

(c) means for determining a first location in a first direction across all the
convolved data sets, and a first basis function representative of a
greatest magnitude; and representing part of the transform data
surrounding the first location with a first atom derived from the first
function corresponding to the greatest determined magnitude;

(d) means for subtracting the first atom from the transform data set to
create a new data set;

(e) means for convolving the new data set with each of a plurality of
second one-dimensional basis functions;

(f) means for determining a second location in a second direction across
all the convolved data sets, and a second basis function representative
of a greatest magnitude; and representing part of the new data set
surrounding a second location with a second atom derived from the
second function corresponding to the greatest determined magnitude;

(g) means for subtracting the second atom from the new data set to create
a further new data set;

(h) means for repeating step (b) with the further new data set, and then
re-applying steps (c) to (f); and

(i) means for outputting as quantized transform data coded versions of
the atoms derived at steps (c) and (f).



The invention further extends to a codec including a coder as previously described.
It further extends to a computer program for carrying out the method described,
and to a machine-readable data carrier which carries such a computer program.

In the preferred method, the transform consists of or includes a decorrelating
transform and/or a frequency based transform.

In applying the matching pursuits algorithm, the mechanism for convolving the
transform data set with each of the plurality of the bases is not critical. Typically,
this may be achieved by calculating the inner product of each of the bases with
every possible position (data point) in the transform data set. However, less
accurate methods of locating the position may also be used. Likewise, the position
where the inner product is greatest may be determined in any convenient way, for
example by searching. Preferably, a small portion of the data around the relevant
point is then represented by the basis function at that position multiplied by a
coefficient which has the same sign as the selected inner product, and the square
root of its magnitude.

The position having the greatest magnitude may be determined by taking the
absolute magnitude (that is, relative to zero). Alternatively, the position of
greatest magnitude may be determined after the application of a function across the
data which may represent a sensory or psychophysical model such as a
psychoacoustic or psychovisual model representative of the perceptual importance
of the data. The function map may, but need not, define threshold values which are
subtracted from the data before the position of greatest magnitude is determined.
Alternatively, the function map may be used as a multiplier to the data, or
combined with it in some other way.
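
The two ways of using a function map described above can be sketched as follows. This is a minimal illustration only: the names `select_position`, `weight_map` and `threshold_map` are hypothetical, and applying the map to the magnitudes of the inner products (rather than to the data before the inner products are formed) is an assumption made for brevity.

```python
import numpy as np

def select_position(inner_products, weight_map=None, threshold_map=None):
    """Return (basis_index, position) of the greatest magnitude.

    inner_products: array of shape (n_bases, n_positions).
    weight_map, threshold_map: optional arrays of shape (n_positions,),
    hypothetical stand-ins for a psychovisual or psychoacoustic model.
    """
    score = np.abs(inner_products)          # magnitude relative to zero
    if threshold_map is not None:
        score = score - threshold_map       # thresholds subtracted before the search
    if weight_map is not None:
        score = score * weight_map          # map used as a multiplier
    basis, pos = np.unravel_index(np.argmax(score), score.shape)
    return int(basis), int(pos)
```

With no map supplied the function reduces to a plain search for the largest absolute inner product.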

The method of the present invention may be used to compress both
two-dimensional data (for example still images), as well as three-dimensional data
(for example moving images with some compression in the time dimension). When
three-dimensional data is to be compressed, a two-dimensional transform may be
used, followed by three one-dimensional matching pursuits algorithms.



In one embodiment of the invention, the whole or part of the transform data set (for
example a sub-band of the data set) may be scanned in the direction in which the
data is most correlated, and one-dimensional matching pursuits may be applied to
the data so scanned.

The invention may be put into practice in a number of ways and several specific 
embodiments will now be described, by way of example, with reference to the 
accompanying drawings in which: 


Figure 1 illustrates a first embodiment of the present invention in which
independent one-dimensional atoms are used;

Figure 2 illustrates a second embodiment in which two one-dimensional atoms are
used to generate a two-dimensional atom;
Figure 3 illustrates a generic method of video coding;

Figure 4 shows in schematic form a video coder according to an embodiment of the
present invention;

Figure 5 shows a decoder corresponding to the encoder of figure 4; and
Figure 6 illustrates an application of matching pursuits to the two-dimensional
output of a wavelet transform.



Before describing the specific embodiments in detail, it may be worth summarising
the operation of the matching pursuits transform. Specifically, we will summarise
the way in which a 2D transform may be used to compress a 2D block of data, such
as a still image.

Matching pursuits in the 2D case uses a library of 2D basis functions, typically
normalized Gabor functions although other functions are equally possible. To
encode the image, the transform forms the inner product of all the bases with every
possible data point. This is equivalent to convolving the data with each and every
basis function. Locally, wherever the basis function resembles the data, peaks will
occur within the inner product. The results are then searched for the basis function
that gives the inner product of largest magnitude: we can then represent a small
portion of the data by the basis function at that position, multiplied by a coefficient
which has the same sign as the selected inner product and the square root of its
magnitude.

This gives what is known as an "atom". The code for the atom is the amplitude and
the position within the data set (e.g. image), along with the number of the
corresponding basis function.

The atom just found is then subtracted from the image, giving a modified image
which represents the so-far uncoded portion of the data. This process is then
iteratively repeated, to find additional atoms. At each iteration, a search is carried
out for the basis function that gives the inner product of largest magnitude. Of
course, it is necessary to update the convolutions only where subtraction of the basis
function has changed them.

The atoms found at each iteration are simply summed to create the encoded version
of the data. When sufficient atoms have been found to represent the original image
at some desired level of fidelity, the resultant list of atoms constitutes the
compressed code. This code can be arranged and entropy coded, if required, to
reduce its size.
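
The remark above that the convolutions need only be updated where a subtraction has changed the data can be sketched in one dimension as follows. This is an illustrative sketch, not a definitive implementation: the function names are hypothetical, the bases are assumed to be odd-length arrays no longer than the signal, and zero padding at the edges is an assumption.

```python
import numpy as np

def all_inner_products(residual, codebook):
    # Inner product of every basis with every position (zero-padded edges).
    return np.stack([np.correlate(residual, b, mode="same") for b in codebook])

def update_inner_products(ips, residual, codebook, lo, hi):
    """Refresh ips in place after residual[lo:hi] has been modified."""
    n = len(residual)
    for k, b in enumerate(codebook):
        hb = len(b) // 2
        # Positions whose inner product can have changed.
        a, z = max(0, lo - hb), min(n, hi + hb)
        # Samples those positions depend on.
        seg_lo, seg_hi = max(0, a - hb), min(n, z + hb)
        local = np.correlate(residual[seg_lo:seg_hi], b, mode="same")
        ips[k, a:z] = local[a - seg_lo:z - seg_lo]
```

Only a window a few basis widths wide is recomputed per atom, instead of the whole data set.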

The one-dimensional matching pursuits transform is similar, except of course that
the code book consists of 1D rather than 2D functions. 1D matching pursuits has
been applied as a transform to raw audio data with promising results, although, as
mentioned above, the fact that the transform is computationally intensive has until
now severely limited its usability in practical real-time applications.
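
A minimal sketch of one-dimensional matching pursuits, with the corresponding decoder that simply sums the atoms, might look like this. The function names are hypothetical. The sketch assumes odd-length, unit-norm bases shorter than the signal, and it uses the common convention in which the coefficient of a unit-norm atom is the inner product itself; that convention is an assumption of the sketch rather than the coefficient rule as worded in the text.

```python
import numpy as np

def add_atom(buf, basis, pos, amp):
    # Add amp * basis centred on pos, clipping at the ends of the buffer.
    half = len(basis) // 2
    lo, hi = max(0, pos - half), min(len(buf), pos + half + 1)
    buf[lo:hi] += amp * basis[lo - pos + half:hi - pos + half]

def mp_encode_1d(signal, codebook, n_atoms):
    """Greedy search: returns a list of (basis, position, amplitude) and the residual."""
    residual = np.asarray(signal, dtype=float).copy()
    atoms = []
    for _ in range(n_atoms):
        # Inner product of every basis with every position.
        ips = np.stack([np.correlate(residual, b, mode="same") for b in codebook])
        k, pos = np.unravel_index(np.argmax(np.abs(ips)), ips.shape)
        amp = float(ips[k, pos])
        add_atom(residual, codebook[k], int(pos), -amp)   # subtract the atom
        atoms.append((int(k), int(pos), amp))
    return atoms, residual

def mp_decode_1d(atoms, codebook, length):
    # The decoded signal is the sum of the coded atoms.
    out = np.zeros(length)
    for k, pos, amp in atoms:
        add_atom(out, codebook[k], pos, amp)
    return out
```

For clarity the sketch recomputes every inner product at each iteration and omits amplitude quantization and entropy coding.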

The operation of the invention according to a first embodiment is shown
schematically in figure 1. Here, some sort of transform (preferably a frequency
transform) has been applied to a multidimensional data set (not shown) to create a
multidimensional transform data set 10. For the sake of illustration only, the data
set 10 is a two-dimensional data set having axes x,y.

To apply matching pursuits to this data set, the data is raster-scanned in the x
direction, and the location 12 is determined at which one finds an atom of greatest
magnitude. Position, amplitude and code book entry are recorded. The atom just
found is then subtracted - typically after quantization of the amplitude - from the
transform data set 10 to create a modified data set 10a. Next, the data set 10a is
raster scanned in the y direction and the process repeated to find the best y atom at
the further location 14 (which may not be the same as location 12).

To encode a two-dimensional image, this process is simply repeated, with the scans
being taken alternately in the x and in the y directions. While the scans are
preferably orthogonal, it is not essential that they be horizontal and vertical, and in
some applications it may be preferable to use alternating raster scans which go
diagonally.
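
The alternating x and y scans described above can be sketched as follows. The names are hypothetical, the raster scans are modelled as row-major and column-major flattening of a two-dimensional array, the bases are assumed to be odd-length and unit-norm, and amplitude quantization is omitted for brevity.

```python
import numpy as np

def best_atom_1d(stream, codebook):
    # Search every basis at every position of the 1D scan.
    ips = np.stack([np.correlate(stream, b, mode="same") for b in codebook])
    k, pos = np.unravel_index(np.argmax(np.abs(ips)), ips.shape)
    return int(k), int(pos), float(ips[k, pos])

def subtract_atom(stream, basis, pos, amp):
    half = len(basis) // 2
    lo, hi = max(0, pos - half), min(len(stream), pos + half + 1)
    stream[lo:hi] -= amp * basis[lo - pos + half:hi - pos + half]

def encode_alternating(data, codebook, n_atoms):
    """Code n_atoms 1D atoms, scanning alternately in x and in y."""
    residual = np.asarray(data, dtype=float).copy()
    atoms = []
    for i in range(n_atoms):
        direction = "x" if i % 2 == 0 else "y"
        order = "C" if direction == "x" else "F"   # row-major or column-major scan
        stream = residual.flatten(order=order)      # 1D copy of the raster scan
        k, pos, amp = best_atom_1d(stream, codebook)
        subtract_atom(stream, codebook[k], pos, amp)
        residual = stream.reshape(residual.shape, order=order)
        atoms.append((direction, k, pos, amp))
    return atoms, residual
```

Each coded atom records its scan direction, code book entry, position within the scan and amplitude.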

This embodiment may easily be extended to higher dimensional data sets. For
example, a three-dimensional data set may be encoded with alternating x,y,z scans,
with an appropriate one-dimensional atom being selected at each of the scans.
When the data set is representative of a video stream in x,y,t, the same approach
may be used with the t axis being treated in exactly the same way as the z axis
above. In other words, the t axis may be treated as if it were an independent spatial
axis. Time-varying three-dimensional data in x,y,z,t may also be treated in a
similar way, with repeated scans being made in x,y,z,t, x,y,z,t and so on. As with
the two-dimensional case, while it is preferred that the scans are made in mutually
orthogonal directions, it is not essential for the axes to be those previously
mentioned. In some embodiments, raster scanning across diagonal planes may be
preferred.

The code book used for each one-dimensional scan may be unique, or alternatively
the same code book may be used for scans in more than one direction. It may be
desirable for a first code book to be used for scans in the spatial dimensions, and
for a second code book to be used for scans in the time dimension.

It is not essential for each raster scan to be taken over the entirety of the data set
to be encoded. Where desirable, the data set may be partitioned before scanning is
undertaken, with the scans being carried out on each partition separately. The
partition may be a spatial partition, a temporal partition, a frequency partition, or
any other type of partition that may be convenient according to the type of data
being processed and the particular transform that is in use. 1D scanning may be
done in different directions in each region.

An alternative and rather more sophisticated approach is shown in figure 2. Here,
scanning is first carried out in the x direction 20 and the location of the
best-fitting x-atom 22 is determined as before. Next, an orthogonal scan 24 in the y
direction is carried out, not across the whole data set but instead just locally in the
region of the atom 22. At this location, the best y-atom is then selected. The
x-atom and the y-atom together define a single (separable) two-dimensional
transform. The amplitude is quantized, and the atom is subtracted from the data set
to create a modified data set on which the procedure may be repeated. Because the
second scan is taken in a direction orthogonal to the first, it will be understood that
the location of the x-atom 22 becomes split up in the y-axis output stream as shown
in the lower part of the figure at 22a, 22b, 22c. The "repair" of the data set
following the subtraction therefore needs to be carried out at multiple places within
the y-axis stream.
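
The two-stage search described above can be sketched as follows. This is an illustration under stated assumptions: the names are hypothetical, `data` is indexed as `data[y, x]`, the bases are odd-length unit-norm arrays, the y search is restricted to exactly the location found by the x scan, and amplitude quantization is omitted. The sketch relies on the fact that the inner product of the data with a separable atom equals the y basis applied to the column of x responses at that location.

```python
import numpy as np

def find_separable_atom(data, x_book, y_book):
    """Return (kx, ky, y0, x0, amp) for the best separable 2D atom."""
    # Stage 1: x scan of every row with every x basis.
    ips = np.array([[np.correlate(row, b, mode="same") for row in data]
                    for b in x_book])                 # shape (n_bases, ny, nx)
    kx, y0, x0 = np.unravel_index(np.argmax(np.abs(ips)), ips.shape)
    # Stage 2: at that location only, try each y basis on the x responses.
    x_profile = ips[kx, :, x0]
    resp = [np.correlate(x_profile, b, mode="same")[y0] for b in y_book]
    ky = int(np.argmax(np.abs(resp)))
    return int(kx), ky, int(y0), int(x0), float(resp[ky])

def subtract_separable_atom(data, bx, by, y0, x0, amp):
    # Subtract amp * outer(by, bx) centred on (y0, x0), clipped at the borders.
    hy, hx = len(by) // 2, len(bx) // 2
    atom = np.outer(by, bx)
    ylo, yhi = max(0, y0 - hy), min(data.shape[0], y0 + hy + 1)
    xlo, xhi = max(0, x0 - hx), min(data.shape[1], x0 + hx + 1)
    data[ylo:yhi, xlo:xhi] -= amp * atom[ylo - y0 + hy:yhi - y0 + hy,
                                         xlo - x0 + hx:xhi - x0 + hx]
```

Because the subtraction touches several rows, the x responses of each of those rows need refreshing before the next search, which corresponds to the multiple "repair" sites mentioned in the text.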

The same approach may be used in more than two dimensions. So, where the
original data set is representative of a three-dimensional model in x,y,z, three
separate one-dimensional scans may be used to generate a single 3D atom.
Likewise, for encoding a video stream in x,y,t, the t-axis may simply be treated as a
third spatial dimension to create a single 3D atom in x,y,t. Where a sequence of
video images is to be encoded, a suitable transform is first applied, as discussed
above, to create a three-dimensional transform data set (in space and time). Then,
on the three-dimensional data set we carry out three matching pursuits separable
searches, each in a different orthogonal direction. So, first, we may for example
carry out a raster scan through the data in the x direction, using a first
one-dimensional code book, and look for the best matched location for each atom
within that code book. Once the best x atom and location have been found, we
then use matching pursuits again, but this time looking in the y direction only and
using a separate y-code book. It is not necessary to scan the entire data set again
in the y direction, as we have found that in practice restricting the y search to a
small area at or very near the previously-identified best location can still provide
good results while substantially reducing computational overhead.

Once the best y atom has been located, the process is repeated, this time using a
one-dimensional code book based upon the time dimension. As before, the t search
may be restricted to an area at or close to the best locations previously found in the
x and/or y searches. It has been found in practice that the best results can be
obtained by using separate matching pursuits code books for the x, y and t
searches. However, where appropriate, a common code book may be used either
just for the x and y directions, or for all three directions.

Once all three one-dimensional atoms have been identified, an entire
three-dimensional block around the preferred location can then be reconstructed. We
recompute (using the inner product) to reconstruct the adjusted data (including the
amplitude, since the "amplitude" of each individual direction is not sufficient, in
itself, to calculate the actual model amplitude).

The modelled data is then quantized and subtracted from the original data. A note
is made of the three one-dimensional atoms for future reference, and the entire
process is then repeated on the reduced data set. This continues until the data set
has been reduced sufficiently, according to the requirements of the particular
application.
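
The recomputation of the amplitude from the full three-dimensional atom, followed by quantization and subtraction, can be sketched as follows. The names are hypothetical, `block` is assumed to be the region of the data already cut out around the chosen location with shape (t, y, x) matching the three bases, and uniform quantization with step `step` is an assumption.

```python
import numpy as np

def atom_3d(bt, by, bx):
    # Separable 3D atom: outer product of the three 1D atoms.
    return np.einsum("i,j,k->ijk", bt, by, bx)

def model_block(block, bt, by, bx, step):
    """Return the quantized amplitude and the block after subtraction."""
    atom = atom_3d(bt, by, bx)
    amp = float(np.sum(block * atom))            # inner product with the full 3D atom
    amp_q = float(step * np.round(amp / step))   # uniform quantization (assumed)
    return amp_q, block - amp_q * atom
```

If the three 1D atoms have unit norm, so does their outer product, and the inner product is then directly the amplitude of the 3D atom.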

The same approach may be applicable to encode time-varying three-dimensional
data, by means of single 4D atoms constructed from separable x,y,z and t atoms.

While the above sets out a variety of possible options, the most preferred
implementation is as follows: after the multidimensional transform we simply scan
the data in any desired 1D readout order. We then code the 1D scan with 1D
Matching Pursuits.

This may be repeated by re-scanning in some other readout order (which may but
need not be orthogonal to the first). Thus, we typically use one or more matching
pursuits algorithms to a maximum of one per dimension of the data.

Where a wavelet transform has been used, the x code book may include atoms
which define frequency, phase, attenuation, amplitude constant and size; the y code
book may define slew, attenuation, size and amplitude (frequency need not be
considered since that has already been decided); and the t code book may define
time-slew, attenuation and size.

A convenient function that may be used to define the individual one dimensional
atoms is the Gabor function. In the x direction, that is given by F(x), where:

$$F(x) = A \cos(\omega x + \phi)\, e^{-\lambda x^{2}}$$

Similar Gabor functions G(y) and H(t) may be used for the y and t directions. It
will be understood, of course, that the amplitude, phase shift and attenuation
constants may be different in each of the three directions.
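
A one-dimensional code book of Gabor atoms might be generated as in the sketch below. The names and the parameter grid are hypothetical, and the sketch assumes the Gaussian-envelope form of the Gabor function, sampled at integer positions and scaled to unit norm so that the amplitude constant is absorbed by the normalisation.

```python
import itertools
import numpy as np

def gabor_atom(omega, phi, lam, half_width):
    """Sampled Gabor atom of length 2*half_width + 1, scaled to unit norm.

    omega: frequency, phi: phase shift, lam: attenuation constant.
    The parameters must not make every sample zero.
    """
    x = np.arange(-half_width, half_width + 1, dtype=float)
    g = np.cos(omega * x + phi) * np.exp(-lam * x ** 2)
    return g / np.linalg.norm(g)

def gabor_codebook(omegas, phis, lams, half_width):
    # One atom for every combination of the supplied parameters.
    return [gabor_atom(w, p, l, half_width)
            for w, p, l in itertools.product(omegas, phis, lams)]
```

Separate calls with different parameter lists would give the distinct x, y and t code books discussed above.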

The three matches are not necessarily carried out in the order described above, and
it may sometimes be sufficient or convenient to match in some other order, for
example t, x, y, or x, t, y.

In the preferred embodiment, each 1D matching pursuit algorithm is implemented
by calculating the inner product of each of the available bases with every
considered position (data point) in the transform data set. The position where the
inner product has the greatest absolute magnitude is then determined in any
convenient way, for example by searching. As mentioned above, the second and
third 1D matches do not require any searching in space at all, although in some
circumstances it may be convenient to carry out a small 1D (or 2 or 3D) search in
the vicinity of the 'best' location found by the previous matches. Alternatively,
instead of looking for the position having the greatest absolute magnitude (that is,
relative to zero) some other measure of magnitude may be used instead. Where a
model such as a psychoacoustic or a psychovisual model is in use, the position of
greatest magnitude may be chosen with reference to that underlying model. One
way of doing that is to apply the model as a weighting over the entire set of data
before determining the position of greatest magnitude; alternatively, another
approach is to view the model as defining thresholds, in which case the thresholds
may be subtracted from the inner products, following which the difference of
greatest magnitude is sought. It will be understood, of course, that other
approaches are possible. In general, a psychoacoustic, psychovisual or other model
may be applied as a function map across the transform data set, with the position of
greatest magnitude being located across the transformed data as modified by the
mapped function.

In any of the embodiments described above, instead of alternating between
scanning directions an approach we call "agile scanning" may be used instead. We
start by scanning separately in each possible direction, and determining the best
possible atom in each of those directions. The amplitude (magnitude) of each of
those atoms is stored. Next, we repeatedly scan the channel (direction) of the
highest magnitude until that channel generates an atom having a magnitude which
is less than that of the stored magnitude of one of the other channels. Then, we
switch to the channel which currently contains the atom of greatest magnitude and
repeatedly scan that in the same way. We switch again when that channel no
longer generates an atom of highest magnitude.

If the most recently scanned channel generates an atom of an identical magnitude
to one which has already been found in another channel, we prefer that the
channels should be switched. Alternatively, however, it would equally be possible
never to switch channels in such a situation.

This approach is particularly efficient, since it allows the encoder to concentrate on
obtaining "quick gains" in one channel, and automatically to switch to another
channel as soon as it becomes optimal to do so. Since the rules used are causal, the
state of the encoder can continually be tracked by corresponding rules set up within
the decoder without the need to transfer any status bits.
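
One possible reading of agile scanning, together with the decoder-side rule that tracks the encoder's channel without status bits, is sketched below. Several points are assumptions of the sketch rather than statements of the text: every atom found is coded, including the initial atom from each channel and the atom whose magnitude triggers a switch; the channels are the row-major and column-major raster scans of a two-dimensional array; `n_atoms` is at least the number of channels; and amplitudes are left unquantized. All names are hypothetical.

```python
import numpy as np

def best_atom_1d(stream, codebook):
    ips = np.stack([np.correlate(stream, b, mode="same") for b in codebook])
    k, pos = np.unravel_index(np.argmax(np.abs(ips)), ips.shape)
    return int(k), int(pos), float(ips[k, pos])

def subtract_atom(stream, basis, pos, amp):
    half = len(basis) // 2
    lo, hi = max(0, pos - half), min(len(stream), pos + half + 1)
    stream[lo:hi] -= amp * basis[lo - pos + half:hi - pos + half]

def next_channel(stored, ch):
    # Switch when the current channel no longer leads; ties also switch.
    rivals = [i for i in range(len(stored)) if i != ch]
    best = max(rivals, key=lambda i: stored[i])
    return best if stored[ch] <= stored[best] else ch

def encode_agile(data, codebook, n_atoms, orders=("C", "F")):
    residual = np.asarray(data, dtype=float).copy()
    atoms, stored = [], []

    def code_atom(ch):
        nonlocal residual
        stream = residual.flatten(order=orders[ch])
        k, pos, amp = best_atom_1d(stream, codebook)
        subtract_atom(stream, codebook[k], pos, amp)
        residual = stream.reshape(residual.shape, order=orders[ch])
        atoms.append((ch, k, pos, amp))
        return abs(amp)

    for ch in range(len(orders)):        # initial scan of every channel
        stored.append(code_atom(ch))
    ch = int(np.argmax(stored))
    while len(atoms) < n_atoms:
        stored[ch] = code_atom(ch)
        ch = next_channel(stored, ch)
    return atoms, residual

def track_channels(magnitudes, n_channels):
    """Decoder side: recover each atom's channel from the magnitudes alone."""
    channels = list(range(n_channels))
    stored = list(magnitudes[:n_channels])
    ch = int(np.argmax(stored))
    for m in magnitudes[n_channels:]:
        channels.append(ch)
        stored[ch] = m
        ch = next_channel(stored, ch)
    return channels
```

Because `track_channels` applies the same causal rule to the decoded magnitudes, it reproduces the encoder's channel sequence. In a practical coder the encoder would apply the rule to the quantized amplitudes so that both sides see identical values.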

The preferred embodiments, described above, provide for the first time the promise
of a full 3D matching pursuits video coder that does not require the use of motion
vectors as in conventional motion compensated video coding. The need for motion
vectors is effectively eliminated by the temporal aspect of the atoms. This has the
effect of making fully-scalable video coding a real possibility for the first time.

Of course, it is by no means excluded that motion compensation may still be used
when desired. In such an embodiment, applicable to both time-varying 2D and 3D
data, once one or more spatial atoms have been determined, those atoms may then
be copied into (or moved along) the time dimension by some prediction
mechanism such as for example the use of motion vectors. This approach will now
be discussed in more detail, with a view to illustrating how embodiments of the
present invention may be incorporated within a motion-compensated codec.

To set the scene for these specific embodiments, we will next describe, briefly,
some standard motion-compensated video compression techniques.

Video compression is divided into two basic categories: motion-compensated and
non motion-compensated. When individual frames are compressed without
reference to any other frames, the compression is described as "intra-coded". One
of the advantages of intra-coded video is that there is no restriction on the editing
which can be carried out on the image sequence. As a result, most digital video in
the broadcasting industry is stored in this way at source. The intra-coding
approach can be used in association with any of a large number of still image
compression techniques such as, for example, the industry standard JPEG
compression scheme. This approach is taken by the moving JPEG standard for
video compression: JPEG compression is used for each of the individual frames,
with each of the frames being handled independently and without reference to any
other frame.

Video sequences are not, however, typically composed of a collection of entirely
unrelated images, and greater compression can normally be obtained by taking
account of the temporal redundancy in the video sequence. This involves a process
known as inter-coded compression. With this approach, individual images in the
output sequence may be defined with reference to changes that have occurred
between that image and a previous image. Since the compressed data stream (sent
across the video channel for reconstruction by the decoder) typically represents
information taken from several frames at once, editing on the compressed data
stream is not normally carried out because the quality is severely compromised.


Inter-coded compression is one of the compression techniques that is incorporated 
into the MPEG video compression standard. 


A typical inter-coded compression scheme is shown schematically in Figure 3. In
that Figure, the upper row O represents the original digitised video frames that are
to be compressed, the second row C represents the compressed images, and the
bottom row R the residuals.

In the scheme shown, selected original frames S are treated as still images, and are
compressed by any convenient method to produce intra-frames I. These frames
are then used as reference frames to create predicted frames P. The contents of
these frames are projected from one or more I frames - either forwards or
backwards in the sequence. This is normally achieved by the use of motion
vectors, associated with moving blocks within the image. Alternatively, the
movement of specific physical objects within the image may be determined and
predicted. Finally, the C sequence is completed by generating interpolated frames
B between the P and I frames. The original video sequence can then be
approximated by the sequential frames of the C sequence, namely the I, B and P
frames. In practice, further corrections normally have to be made if the end result
is to appear reasonable. These further corrections are achieved by determining a
residual frame R corresponding, in each case, to the difference between the original
frame and the corresponding compressed frame. Residual frames may, but need
not, be calculated for the intra frames. Accordingly, the residual frames marked X
may sometimes be omitted.

In a practical embodiment, an encoder calculates the I frames from those
original frames labelled S in the diagram, and, from that, calculates the motion
parameters (vectors) that are needed to define the P frames. The data stream
transmitted from the encoder to the decoder thus includes the encoded I frames and
the appropriate motion vectors enabling the decoder to construct the P frames.
Information on the B frames is not sent, since those can be reconstructed by the
decoder alone purely on the basis of the information within the I and P frames. In
order to improve the final result, the data stream also includes the residual images,
sent on a frame by frame basis. Since the residual image represents the difference
between the original image and the compressed image, the encoder needs to have
access to the sequence of compressed images. That is achieved by incorporating an
additional decoder within the encoder.

The final data stream, as sent, therefore includes the full I frames, the motion
vectors for the P frames and all of the residual frames possibly excluding those that
are labelled X in Figure 1. Each residual image is typically compressed before
transmission.

Numerous transforms, including matching pursuits, are known in the art for
compressing the original S frames to produce the Intra frames. It has also been
suggested, in the Neff and Zakhor paper mentioned previously, that matching
pursuits may be used to encode the residual images.


In contrast, in the preferred embodiment, the raw images are transformed by means
of any standard transform, and the output of the transform is then quantized using
the matching pursuits algorithm. The same applies to any residual images: instead
of applying matching pursuits as a transform to the residual image, the residual
image is first transformed using a standard transform, and the output of that
transform is then quantized using matching pursuits. In both cases, the initial
transform which operates on the data itself may for example be an FFT, a wavelet
transform, a DCT or a lapped orthogonal transform. Other transforms could also be
used.
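The two-stage arrangement (standard transform first, matching pursuits as the
quantizer of the transform output) may be sketched as follows. This is an
illustrative sketch under stated assumptions, not the coder of the embodiment: the
transform is a one-level orthonormal Haar transform of a 1D signal, the code book
CODEBOOK is a hypothetical set of four short unit-norm basis functions, and the
atom amplitude is rounded to a multiple of an assumed step size. The search is a
plain exhaustive greedy search.

```python
import math


def haar_1d(signal):
    """One-level orthonormal Haar transform: scaled sums, then scaled differences."""
    s = 1 / math.sqrt(2)
    lows = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    highs = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return lows + highs


def normalise(v):
    """Scale a basis function to unit energy."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


# Hypothetical code book of short 1D basis functions (an assumption).
CODEBOOK = [normalise(b) for b in ([1.0], [1.0, 1.0], [1.0, -1.0], [1.0, 2.0, 1.0])]


def mp_quantize(coeffs, codebook, max_atoms, step):
    """Greedy 1D matching pursuit; returns (atoms, residual)."""
    residual = list(coeffs)
    atoms = []
    for _ in range(max_atoms):
        best_pos, best_idx, best_ip = 0, 0, 0.0
        for idx, basis in enumerate(codebook):
            for pos in range(len(residual) - len(basis) + 1):
                ip = sum(residual[pos + k] * b for k, b in enumerate(basis))
                if abs(ip) > abs(best_ip):
                    best_pos, best_idx, best_ip = pos, idx, ip
        amp = round(best_ip / step) * step  # quantized atom amplitude
        if amp == 0:
            break  # nothing left above the quantizer threshold
        atoms.append((best_pos, best_idx, amp))
        # Subtract the quantized atom so later atoms code what remains.
        for k, b in enumerate(codebook[best_idx]):
            residual[best_pos + k] -= amp * b
    return atoms, residual


def rebuild(atoms, codebook, length):
    """Decoder side: sum the atoms back into a coefficient array."""
    out = [0.0] * length
    for pos, idx, amp in atoms:
        for k, b in enumerate(codebook[idx]):
            out[pos + k] += amp * b
    return out


coeffs = haar_1d([4.0, 4.0, 5.0, 5.0, 9.0, 1.0, 2.0, 2.0])
atoms, residual = mp_quantize(coeffs, CODEBOOK, max_atoms=6, step=0.25)
```

Each atom is a (position, basis index, amplitude) triple, which is all that would need
to be coded; the rebuilt coefficients plus the returned residual equal the original
transform output.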
Where motion vectors are to be used, the methods discussed above may be
incorporated within a motion-compensated hardware or software encoder, as
shown in Figure 4, although as previously mentioned motion vector compensation
is not necessarily required at all in the present invention.

As shown in Figure 4, frame by frame input is applied at an input 302, with the
intra-frame data being passed to an intra-frame transform 304 and then to a
matching pursuits coder or atom finder 303. The atom amplitude is then
quantized at 305. The inter-frame data is passed to a motion estimator 306 which
provides a parametrised motion description on line 308, this then being passed to a
motion compensator 310. The motion compensator outputs a predicted frame
along a line 312 which is subtracted from the input frame to provide a residual
frame 314 which is passed to a residual transform 316. The transform output is
applied to a matching pursuits coder 309 and then to a quantizer 307 which outputs
quantized codes to the output stream.

The motion description on line 308 is also passed to a motion description coder 
320, which codes the description and outputs motion data on a line 322.

The output stream thus consists of coded intra-frame data, residual data and motion
data. 

The output stream is fed back to a reference decoder 324 which itself feeds back a
reference frame (intra or inter) along lines 326, 328 respectively to the motion
compensator and the motion estimator. In that way, the motion compensator and
the motion estimator are always aware of exactly what has just been sent in the
output stream. The reference decoder 324 may itself be a full decoder, for example
as illustrated in Figure 5.

Generally, the motion vectors may be derived by comparing a successor frame with
the decompressed previous frame; in the alternative, the original previous frame
could be used. In either case, the residual frames are calculated as the difference
between the predicted frame and the original successor frame. In a variation (not
shown) of the embodiment, the frames being compared might be pre-processed to
improve the motion vectors.



The output stream travels across a communications network and, at the other end,
is decoded by a decoder which is shown schematically in Figure 5. The
intra-information in the data stream is supplied to an intra-frame decoder 410, which
provides decoded intra-frame information on a line 412. The inter-information is
supplied to a bus 414. From that bus, the residual data is transmitted along a line
416 to a residual decoder 418. Simultaneously, the motion data is supplied along a
line 420 to a motion compensator 422. The outputs from the residual decoder and
the motion compensator are added together to provide a decoded inter-frame on
line 423.
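The addition of the motion compensator output to the residual decoder output can
be sketched as below. The sketch makes several assumptions that are not taken from
the specification: frames are small integer grids, one motion vector (dy, dx) is
given per square block and points from each pixel to its source in the reference
frame, and source coordinates are clamped at the frame border. The function names
are hypothetical.

```python
def motion_compensate(reference, vectors, block=2):
    """Build a predicted frame by copying displaced pixels from the reference."""
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = vectors[y // block][x // block]
            src_y = min(max(y + dy, 0), height - 1)  # clamp at the border
            src_x = min(max(x + dx, 0), width - 1)
            predicted[y][x] = reference[src_y][src_x]
    return predicted


def decode_inter_frame(reference, vectors, residual, block=2):
    """Decoded inter-frame = motion-compensated prediction + decoded residual."""
    predicted = motion_compensate(reference, vectors, block)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


reference = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
vectors = [[(0, 1), (0, 0)], [(0, 0), (0, 0)]]  # top-left block moves by one column
residual = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame = decode_inter_frame(reference, vectors, residual)
```

The decoded frame would then serve as the reference for the next inter-frame, which
corresponds to the feedback of reference frame information to the motion
compensator.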



Reference frame information is fed back along a line 424 to the motion
compensator, so that the motion compensator always has current details of both the
output from and the input to the decoder.

It will be understood of course that the invention is not restricted to use with the
type of motion-compensated video coder as shown in Figure 4: it may be used in
any type of video coder, where the output from the main transform needs to be
quantized.

This approach may be used not only for video compression, as previously 
described, but also for still image compression.

A further embodiment, in which the raw input data is for example representative of
a still image, is shown in Figure 6. Here, the input data/image 40 is first
transformed/compressed in some way (e.g. by means of a wavelet transform 41), to
produce a transformed image 42. That image is then quantized by means of a
matching pursuits coder and quantizer 43 to produce the final coded output 44.

The wavelet transform 41 could be replaced with any other convenient
compression transform such as an FFT, a DCT, or a lapped orthogonal transform.

In the example shown in Figure 6, the image 40 undergoes a wavelet transform
which splits the image up into several spatially-filtered sections or sub-bands 45,
46, 47, 48. Sections 46 and 47 have been high-pass filtered in one direction and
low-pass filtered in another, which means that those two sub-bands are better
decorrelated in one direction than in the other. It will be understood, of course,
that a horizontal transform could be followed by a vertical, or vice versa. After
raster scanning those sub-bands as indicated by the reference numerals 400, 401, a
one-dimensional matching pursuits quantization is then used. A fairly small
matching pursuits code book may be used, for each direction, since the finding of
structure within the image at different scales has already been automatically carried
out by the wavelet transform: that no longer needs to be carried out by the
matching pursuits algorithm.
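The raster scanning step can be illustrated with a short sketch that flattens a 2D
sub-band into a 1D sequence along a chosen direction, ready for a one-dimensional
coder, and restores it afterwards. The function names and the "horizontal" and
"vertical" labels are assumptions for illustration; which direction suits a given
sub-band is left to the caller.

```python
def raster_scan(band, direction):
    """Flatten a 2D sub-band row by row or column by column."""
    if direction == "horizontal":
        return [value for row in band for value in row]
    if direction == "vertical":
        return [band[y][x] for x in range(len(band[0])) for y in range(len(band))]
    raise ValueError("direction must be 'horizontal' or 'vertical'")


def raster_unscan(sequence, height, width, direction):
    """Inverse of raster_scan: rebuild the 2D sub-band from the 1D sequence."""
    band = [[0] * width for _ in range(height)]
    for i, value in enumerate(sequence):
        if direction == "horizontal":
            y, x = divmod(i, width)
        elif direction == "vertical":
            x, y = divmod(i, height)
        else:
            raise ValueError("direction must be 'horizontal' or 'vertical'")
        band[y][x] = value
    return band


band = [[1, 2, 3], [4, 5, 6]]
row_wise = raster_scan(band, "horizontal")   # [1, 2, 3, 4, 5, 6]
column_wise = raster_scan(band, "vertical")  # [1, 4, 2, 5, 3, 6]
```

Scanning along one direction places samples that are neighbours in that direction
next to each other in the 1D sequence, so short 1D basis functions can capture
whatever structure remains along it.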

The use of 1D matching pursuits algorithms to quantize the output of a 2D
transform is applicable not only to wavelet transforms but to any other 2D
transforms which decorrelate better in one direction than in another (at least over
part of the area of the output). Generally, the output of the transform may
automatically divide the data up into a number of different partitions, and the
partitions may then be individually scanned, each in a separate preferred direction.

It is envisaged that the matching pursuits algorithm may be applied to the output of
any multi-dimensional decorrelating transform, preferably a frequency transform.



Claims

1. A method of data compression comprising applying a transform to multi-
dimensional data to generate a multi-dimensional transform data set, and
coding the transform data set by applying one or more one-dimensional
matching pursuits algorithms.

2. A method of data compression comprising:

(a) applying a transform to multi-dimensional data to generate a multi- 
dimensional transform data set; 

(b) convolving the transform data set with each of a plurality of first one-
dimensional basis functions to generate a corresponding plurality of
convolved data sets;

(c) determining a location in a first direction across all the convolved 
data sets, and a first basis function, representative of a greatest
magnitude; 

(d) representing part of the transform data surrounding the said location
with an atom derived from the first and second basis functions
corresponding to the greatest determined magnitudes;

(e) subtracting the atom from the transform data set to create a new data
set; 

(f) repeatedly updating the convolved data sets by convolving any 
changed part of the transform data set with each of the plurality of
first one-dimensional basis functions, and then re-applying steps (c) 
and (d); and 

(g) outputting as transform data coded versions of the atoms derived at
step (d).

3. A method of data compression as claimed in claim 2 in which the coded
version of each atom includes magnitude, position in transform data set and
number of basis function.

4. A method of data compression as claimed in any one of the preceding
claims in which the data to be compressed represents video image data.

5. A method of data compression as claimed in any one of claims 1 to 3 in
which the data to be compressed represents a still image.

6. A method of data compression as claimed in any one of claims 1 to 3 in
which the data to be compressed comprises residual images within a motion
compensated video coder.

7. A method of data compression as claimed in any one of claims 1 to 3 in
which one dimension of the transform data set represents time.

8. A method of data compression as claimed in any one of claims 1 to 3 in
which the transform is a frequency-separating transform.

9. A method of data compression as claimed in claim 8 in which the transform
decorrelates at least part of the transform data set better in one direction than
in a perpendicular direction, and in which a first algorithm is applied by
carrying out a one-dimensional scan in the direction of greatest correlation.

10. A method of data compression as claimed in claim 1 in which the transform
is two-dimensional.

11. A method of data compression as claimed in claim 1 in which the
algorithms are applied by sequential one-dimensional scans through the
data.



12. A method of data compression as claimed in claim 11 in which the scans
successively switch between directions within the data.

13. A method of data compression as claimed in claim 11 in which successive 
scans continue in the same direction until an atom is located of lower 
magnitude than atoms which have previously been located in scans in other 
directions, and in which the scan direction is then changed.

14. A method of data compression as claimed in claim 13 in which the scan 
direction is changed to that direction in which an atom of highest current 
magnitude has previously been located.

15. A method as claimed in claim 2 including applying a function map to the
convolved data sets before determining the location of greatest magnitude.

16. A method as claimed in claim 15 in which the function map represents a
sensory model.

17. A method as claimed in claim 15 in which the function map represents a
psychoacoustic model.



18. A method as claimed in claim 15 in which the function map represents a
psychovisual model.

19. A method as claimed in any one of claims 15 to 18 in which the function
map is multiplicatively applied.

20. A method as claimed in any one of claims 15 to 19 in which the function
map is additively or subtractively applied.

21. A method as claimed in claim 2 in which the second one-dimensional basis 
functions extend in the spatial domain.


22. A method as claimed in claim 2 in which the second one-dimensional basis 
functions extend in the time domain.



23. A method as claimed in claim 2, including the additional steps of:

(a) convolving the transform data at the said location with each of a
plurality of third one-dimensional basis functions; and

(b) determining a third basis function of a greatest magnitude;

and in which the atom is further derived from the third basis function
corresponding to the greatest determined magnitude.

24. A method as claimed in claim 2, in which the second basis function
representative of the greatest magnitude is determined without further
searching in the region of the said location.


25. A method as claimed in claim 2, in which the second basis function 
representative of the greatest magnitude is determined at least partly by 
searching a local area in the region of the said location. 

26. A method of data compression comprising: 

(a) applying a transform to multi-dimensional data to generate a multi-
dimensional transform data set;

(b) convolving the transform data set with each of a plurality of first one-
dimensional basis functions to generate a corresponding plurality of
convolved data sets;

(c) determining a first location in a first direction across all the 
convolved data sets, and a first basis function representative of a 
greatest magnitude; and representing part of the transform data 
surrounding the first location with a first atom derived from the first
function corresponding to the greatest determined magnitude;

(d) subtracting the first atom from the transform data set to create a new
data set; 

(e) convolving the new data set with each of a plurality of second one-
dimensional basis functions;

(f) determining a second location in a second direction across all the
convolved data sets, and a second basis function representative of a
greatest magnitude; and representing part of the new data set
surrounding a second location with a second atom derived from the
second function corresponding to the greatest determined magnitude;

(g) subtracting the second atom from the new data set to create a further
new data set; 

(h) repeating step (b) with the further new data set, and then re-applying
steps (c) to (f); and

(i) outputting as quantized transform data coded versions of the atoms
derived at steps (c) and (f).

27. A method of data compression as claimed in claim 26 in which the first 
location and the second location are coincident.

28. A coder for data compression comprising means for applying a transform to 
time-varying data to generate a multi-dimensional transform data set, and a 
coder for coding the transform data set by applying a plurality of one-
dimensional matching pursuits algorithms, one for each dimension.



29. A coder for data compression comprising:



(a) means for applying a transform to multi-dimensional data to 
generate a multi-dimensional transform data set; 

(b) means for convolving the transform data set with each of a 
plurality of first one-dimensional basis functions to generate a
corresponding plurality of convolved data sets; 

(c) means for determining a location in a first direction across all
the convolved data sets, and a first basis function
representative of a greatest magnitude;

(d) means for representing part of the transform data surrounding
the said location with an atom derived from the first function
corresponding to the greatest determined magnitudes;

(g) means for subtracting the atom from the transform data set to
create a new data set;

(h) means for repeatedly updating the convolved data sets by
convolving any changed part of the transform data set with
each of the plurality of first one-dimensional basis functions;
and

(i) means for outputting as transform data coded versions of the
derived atoms.

30. A coder for data compression as claimed in Claim 29 including: 

(c1) means for convolving the transform data at the said location
with each of a plurality of second one-dimensional basis
functions; and 

(c2) means for determining a second basis function representative 
of a greatest magnitude; 

and in which the means for representing part of the transform data further
operates upon the second basis functions.

31. A coder for data compression comprising:

(a) means for applying a transform to multi-dimensional data to generate
a multi-dimensional transform data set;

(b) means for convolving the transform data set with each of a plurality
of first one-dimensional basis functions to generate a corresponding
plurality of convolved data sets;

(c) means for determining a first location in a first direction across all the
convolved data sets, and a first basis function representative of a
greatest magnitude; and representing part of the transform data
surrounding the first location with a first atom derived from the first
function corresponding to the greatest determined magnitude.

32. A coder for data compression comprising:

(a) means for applying a transform to multi-dimensional data to generate
a multi-dimensional transform data set;

(b) means for convolving the transform data set with each of a plurality
of first one-dimensional basis functions to generate a corresponding
plurality of convolved data sets;

(c) means for determining a first location in a first direction across all the
convolved data sets, and a first basis function representative of a
greatest magnitude; and representing part of the transform data
surrounding the first location with a first atom derived from the first
function corresponding to the greatest determined magnitude;

(d) means for subtracting the first atom from the transform data set to
create a new data set;

(e) means for convolving the new data set with each of a plurality of
second one-dimensional basis functions;

(f) means for determining a second location in a second direction across
all the convolved data sets, and a second basis function representative
of a greatest magnitude; and representing part of the new data set
surrounding a second location with a second atom derived from the
second function corresponding to the greatest determined magnitude;

(g) means for subtracting the second atom from the new data set to create
a further new data set;

(h) means for repeating step (b) with the further new data set, and then
re-applying steps (c) to (f); and

(i) means for outputting as transform data coded versions of the atoms
derived at steps (c) and (f).

33. A codec including a coder as claimed in any one of claims 28 to 32. 

34. A computer program for carrying out a method as claimed in any one of 
claims 1 to 27. 



35. A machine-readable data carrier carrying a computer program as claimed in 
claim 34. 



36. A method as claimed in Claim 2 including the further steps of:

(c1) convolving the transform data at the said location with each of a
plurality of second one-dimensional basis functions;

(c2) determining a second basis function representative of a greatest
magnitude;

and including, at step (d), representing part of the transform data
surrounding the said location with an atom derived both from the first and
from the second basis functions corresponding to the greatest determined
magnitudes.

37. A method of data compression as claimed in claim 11 in which the 
sequential one-dimensional scans through the data are orthogonal. 




Abstract 

A method and apparatus for data compression comprises applying a decorrelating
transform to multi-dimensional data to be compressed, then using a sequence of
one or more one-dimensional matching pursuits algorithms to code the output of 
the transform. The invention finds particular application in video and still image 
coders, particularly real-time coders, those intended for use at very low bit rates, 
and those for which scalable bit rates are of importance. 



(Figure 2)